Time series analyses based on the joint lagged effect analysis of pollution and meteorological factors of hemorrhagic fever with renal syndrome and the construction of prediction model

Background Hemorrhagic fever with renal syndrome (HFRS) is a rodent-related zoonotic disease induced by hantavirus. Previous studies have identified the influence of meteorological factors on the onset of HFRS, but few studies have focused on the stratified analysis of the lagged effects and interactions of pollution and meteorological factors on HFRS. Methods We collected meteorological, contaminant and epidemiological data on cases of HFRS in Shenyang from 2005–2019. A seasonal autoregressive integrated moving average (SARIMA) model was used to predict the incidence of HFRS and compared with Holt-Winters three-parameter exponential smoothing model. A distributed lag nonlinear model (DLNM) with a maximum lag period of 16 days was applied to assess the lag, stratification and extreme effects of pollution and meteorological factors on HFRS cases, followed by a generalized additive model (GAM) to explore the interaction of SO2 and two other meteorological factors on HFRS cases. Results The SARIMA monthly model has better fit and forecasting power than its own quarterly model and the Holt-Winters model, with an optimal model of (1,1,0) (2,1,0)12. Overall, environmental factors including humidity, wind speed and SO2 were correlated with the onset of HFRS and there was a non-linear exposure-lag-response association. Extremely high SO2 increased the risk of HFRS incidence, with the maximum RR values: 2.583 (95%CI:1.145,5.827). Extremely low windy and low SO2 played a significant protective role on HFRS infection, with the minimum RR values: 0.487 (95%CI:0.260,0.912) and 0.577 (95%CI:0.370,0.898), respectively. Interaction indicated that the risk of HFRS infection reached its highest when increasing daily SO2 and decreasing humidity. Conclusions The SARIMA model may help to enhance the forecast of monthly HFRS incidence based on a long-range dataset. Our study had shown that environmental factors such as humidity and SO2 have a delayed effect on the occurrence of HFRS and that the effect of humidity can be influenced by SO2 and wind speed. Public health professionals should take greater care in controlling HFRS in low humidity, low windy conditions and 2–3 days after SO2 levels above 200 μg/m3.


Introduction
This hantavirus-like infection has attracted world attention during the Korean War since it was first described in Chinese texts 900 years ago [1]. Two clinical syndromes caused by hantavirus infection have been characterized: hantavirus cardiopulmonary syndrome(HCPS), prevalent mainly in America, and Hemorrhagic fever with renal syndrome(HFRS), found in Eurasia [2]. HFRS [1], characterized by headache, fever, back pain, abdominal pain and acute renal insufficiency [3], has caused a variety of public problems, with 30,000-60,000 cases per year in mainland China in the 1990s [4]. In Europe, over 9000 cases of HFRS are reported annually, and most cases associated with HFRS are diagnosed in parts of Europe in Russia, Finland and Sweden [5,6]. China has the highest rate of HFRS disease in the world, with domestic HFRS cases constituting 90% of the total number of cases worldwide each year [7,8]. HFRS is a zoonotic disease associated with rodents and a legally reported disease in China [9].
In the past, descriptive statistical methods were mostly used for the analysis of infectious diseases such as HFRS. Nonetheless, effective prediction of short and medium term HFRS incidence can provide a reliable basis for the Center of Disease Control (CDC), as well as scientific theory and support for national infectious disease prevention and control planning [10]. With the application of big data prediction technology in various fields, it also supplies approaches for the development of HFRS prediction technology. At present, there are many ways to study data prediction technology at home and abroad, and they have been broadly applied good results [11]. Regarding infectious disease forecasting, autoregressive integrated moving average (ARIMA) and Holt-Winters models are one of the most representative and widely used models in time series forecasting [12]. Today, predictive modeling has been used in many studies in epidemiological research. Some researchers applied the seasonal autoregressive fractionally integrated moving average (SARFIMA) model to predict renal syndrome hemorrhagic fever [13]. Furthermore, a European study had applied mathematical modeling to explore the pathogenesis and impact of influenza and pathogens [14]. Junyu He et al. applied prophet and ARIMA models in 2022 to evaluate the predictive effect of HFRS incidence in the Chinese region from 1950-2018 [15].
China began to show a declining trend in HFRS cases in the early 1990s and the annual number has fallen more significantly since 2000, from 37,814 cases in 2000 to 11,248 cases in 2007 [4,16]. Seven hantavirus serotypes/genotypes have been identified in China [17] Of these, Hantavirus and Seoul virus are the main pathogens of HFRS, and cases caused by Hantavirus account for about 70% of domestic HFRS cases [18]. Climate and environmental changes might impact the reservoir ecology and dynamics of rodent carriers, thereby triggering the spread of hantavirus transmission [4,19].
Climatic conditions are broadly regarded as some of the most pivotal factors affecting rodent population dynamics and contributing to more cases of HFRS in humans [20]. Some studies have found a correlation between the climatic factors and HFRS. For instance, a systematic evaluation of climate variability and human hantavirus infection in Europe was previously carried out by J. Roda Gracia et al. In 2010, some researches in China used time-series Poisson regression model to examine the independent effect of climate variables on the spread of HFRS, pointing out the important role of climate variation in the transmission of HFRS in northeastern China [21].It will contribute to future international discussions on zoonotic diseases in the context of climate change [22]. Yizhe Luo et al. found that temperature with a lag of 6 months (RR = 3.05) and precipitation with a lag of 0 months (RR = 2.08) had the greatest effect on the incidence of HFRS [23]. Also, recent Chinese studies have shown that temperature and relative humidity have an approximately parabolic or linear effect on the incidence of HFRS in 2022 [24]. Yet little research has been done on the correlation between pollutants and HFRS epidemics. And few studies have synthesized the lagged effects of diverse environmental variables on the onset of HFRS and analyzed the interactions among them. Therefore, we explored the relationship between meteorological and pollutant factors and the onset of HFRS, and speculated that the interaction among environmental factors is of attention for HFRS.
In this study, the incidence rate was predicted by comparison using time series analysis using HFRS surveillance data in Shenyang, followed by the application of boosted regression trees (BRT) verifying the fit and interaction among the environmental factors, and the lag effect and interaction of meteorological and pollutant factors were investigated using distributed lag nonlinear models (DLNM) and generalized additive models (GAM).

Data collection
Ethics statement. The present study was approved by Shenyang Center for Disease Control and Prevention (CDC). All the HFRS data were anonymously analyzed for the consideration of confidentiality.
Setting. Shenyang is the capital city of Liaoning Province. It is a district-level city in China, covering both urban and rural locations. Shenyang is located in latitude 41˚11 0 -430 2 0 N and longitude 122˚25 0 -123˚48 0 E, measures 12,860 Sq km and composed of 13 districts and 214 towns [25]. In 2019, Shenyang City's average population was 7,511,923. Shenyang city belongs to the temperate semi-humid continental climate zone. The geographical situation of Shenyang is indicated in S1 Fig.

The HFRS dataset
We obtained surveillance data on cases of HFRS in Shenyang between 2005 and 2019 from CDC of Shenyang. All patients were diagnosed according to the Criteria and Management Principles of Renal Syndrome Hemorrhagic Fever issued by the Ministry of Health of the People's Republic of China (Ministry of Health 1998). The number of HFRS morbidity was diagnosed according to Diagnostic Criteria and Principles of Management of Epidemic Hemorrhagic Fever (GB 15996-1995). For an HFRS case to be confirmed, the affected person must have either traveled to an endemic area or had contact with rodent feces, saliva, or urine within the two months preceding the onset of their illness. They must have experienced an acute illness characterized by abrupt onset of at least two of the following clinical features: fever, chills, hemorrhage, headache, back pain, abdominal pain, acute renal dysfunction, and hypotension. Furthermore, they must have received at least one laboratory confirmation test, such as a positive result for hantavirus-specific immunoglobulin M, a four-fold increase in titers of hantavirus-specific immunoglobulin G between acute and convalescent serum samples, positive hantavirus-specific ribonucleic acid results by reverse transcription polymerase chain reaction in clinical specimens, or hantavirus isolated from clinical specimens, to meet the diagnostic criteria for HFRS [21,26].

Pollution and meteorological factors data
We obtained daily weather data for 2005 to 2019 from the China Meteorological Data Sharing Service. The weather dataset is available from China Meteorological Data Sharing Service System (http://data.cma.cn). Meteorological factors include air pressure, sunshine, air temperature, air humidity, wind speed and rainfall. In 2013, the national air pollution population health impact monitoring project was officially launched, and Liaoning Province, one of the 16 provinces with heavy air pollution levels, also participated in the air pollution population health impact monitoring project, and Shenyang became the first monitoring city in Liaoning Province. Pollutants information for 2014 to 2019 were originally from 11 state-controlled environmental air quality automatic monitoring stations through the website of Shenyang Bureau of Ecology and Environment due to ensure the accuracy [27]. Pollutants include PM 2.5 , PM 10 , SO 2 , NO 2 , CO and O 3 .

SARIMA and Holt-Winters model construction
Based on the quarterly and monthly data series, additive and multiplicative models were adopted to build factor decomposition models respectively, followed by the application of simple central moving average method to decompose the following four factor maps respectively: (1) long-term trend. Auto regressive integrated moving average model [10] is a time series forecasting method proposed by Geogre Box and Gwilym Jenkins. The ARIMA model is a classical time-series analysis method and is extensively used. The SARIMA model is developed on the foundation of the ARIMA model. The SARIMA model is based on a further development of the ARIMA model, which is particularly suitable for cases where both trend and seasonality are present in the series. The SARIMA model is abbreviated as SARIMA (P, D, Q) S , where p and q are the orders of autoregressive and moving average, P and Q are the orders of seasonal autoregressive and moving average, d is the number of variances, D is the number of seasonal variances, and S is the seasonal period and cycle length [9]. The construction of the SARIMA model is shown as following equation: where L is the delay operator, A P (L s ) is the p-order autoregressive operator, A q (L s ) is the qorder seasonal moving average operator, r d = (1-L) d is the difference operation, and rD s = (1-L s ) d is the seasonal difference operation. The order of approximation of the model is determined based on the autocorrelation function. The QAIC information criterion is then used to determine the best combination of parameters for the model, and the model satisfies the residual white noise test [28].
The Holt-Winters model is a forecasting technique proposed by Holt and Winters in 1960 that is based on speculative smoothing. Unlike ARIMA, Holt's linear equation has a built-in equation for seasonal factors that directly captures seasonality [29].
Three smoothing equations are used to calculate and evaluate deseasonalized series, trends and seasonality variables. The Holt-Winters' additive method can be written as follows: where t = 1,. . ., n, S represents the length of seasonality (months), L t represents the level of the series, and b t denotes the trend and seasonal components. The constants used in this model are α (horizontal smoothing constant), γ (trend smoothing constant) and δ (seasonal smoothing constant) [30].

BRT model construction
BRT methods have been successfully applied to research fields such as disease modeling [31]. The BRT method produces a series of trees, each of which grows on the remnant of the previous tree. Recent studies have shown that the BRT model can explain the interactions between exposures in observational studies [32]. In the BRT model, f(x) is an evaluation of the response y based on a vector predictor of x, which in turn is integrated as an additive form of b(x; γ m ), as follows: where β m is the expansion factor and b(x; γ m ) represent the individual trees with parameters y and variables x. The coefficient β m reflects the weights allocated to the nodes of each tree and identifies the type of combination predicted for each tree. In this approach, the three regularization parameters, number of trees, learning rate (lr) and tree complexity (tc), should be optimized. The complexity (tc) should be optimized [33]. To this purpose, in this study, various nt, tc (1-10), and lr (0.001, 0.05, and 0.01) are allocated to the training of the BRT model in order to maximize the model performance [34].

DLNM model and GAM construction
DLNM has been extensively used to assess the exposure-lag-response relationship between environmental factors and human diseases, such as congenital heart disease, HFRS, non-accidental deaths and so on [35,36]. The model can be written as follows: To analyze the lag and extreme effects of environmental factors, Humidity, wind speed and SO 2 were taken and applied to the cross-basis functions of DLNM. Here, Y t was the number of daily counts of HFRS cases in daily t; α was the intercept of the whole model; NS is a natural cubic spline that acts as a smooth function of the model; M represents the estimated environmental variable related to HFRS; X t is the other environmental variables in the pathogenesis of HFRS that requires nonlinear confounding effect adjustment; NS was used to adjust for daily confounding in the model; DOW is a categorical variable for day of week; Holiday is a binary variable used to control the effect of Chinese public holidays, β and γ are the regression coefficients; The optimal degrees of freedom (df) for the spline function were estimated by Akaike information criterion for quasi-Poisson (Q-AIC) and minimum partial regression coefficient (PACF min ) criteria; NS of 4,6 and 8 df were used for wind speed, SO 2 and relative humidity respectively, and the lag space was set to 3 df. NS with 5df/year was applied to time variable. In addition, as the incubation period for human hantavirus infection is typically 7-14 days, our model applied the Q-AIC guidelines using a delay of up to 16 days.
In our study, the median environmental variable has been used as a reference value to compute the relative risk. We then compared the 25th and 75th percentile of each environmental variable ("low" and "high") to its median value, in order to explore the stratified effect of modification and qualitatively study the association among environmental factors and HFRS cases. The impact of environmental factors was analyzed by stratifying by gender, age group and number of diagnostic delayed days in order to identify susceptible populations and their corresponding sensitivities.
Subsequently, the GAM method was used to explore the interaction of meteorological and pollutant factors on the prevalence of HFRS. The model can be written as follows: α2 is the intercept; X 1 indicates one of the environmental factors (humidity, wind speed and SO 2 ) whereas X 2 and X 3 denote the other two; s () indicates penalized spline function. s 1 (X 1 , X 2 ) is a spline function of the interaction between the parameters X 1 and X 2 . (X 1 , X 2 , X 3 are all 16 lagged variables.)

Statistical analysis
As there were missing values in the incidence data of HFRS in Shenyang, we performed linear interpolation to compensate as soon as possible in order to better apply Box-Jenkins and exponential smoothing methods for incidence prediction. A total of 10 sets of data were missing, located in the years 2011-2019, mainly in 2016 and 2019. Of these, only a single month was missing in 2011, 2012, 2014, 2015, and 2017, while two months were missing in 2016 (September-October) and three months were missing in 2019 (February, July, and October). For the influence of meteorological and pollutant factors on the number of HFRS cases, Spearman correlation analysis was used for feature selection, followed by BRT to fit the selected features to the variables and interaction tests. We developed a DLNM with a maximum lag of 16 days to evaluate the lagged, stratification and extreme effects of pollution and meteorological factors on the cases of HFRS. A GAM then was established to explore the interaction of SO 2 and two other meteorological factors on HFRS cases.
All analyses in our study were performed in R software (version 4.1.3).

Descriptive characteristics of HFRS cases and environmental factors
A total of 1,880 cases of HFRS were reported in Shenyang from 2005-2019, of which the incidence of HFRS was predominantly in young adults aged 20-50, accounting for 71.81% of all cases. Men are more susceptible than women at a ratio of 3.67:1 (1477:403). For 2005-2019, the incidence of HFRS in Shenyang was 25.03 per 100,000, and the case fatality rate was 0.691% ( Table 1). The onset of HFRS showed significant differences in seasonality, age, and delayed days in diagnosis of onset (p<0.05) ( Table 2). From Table 2 meaningful subgroups differed in seasonal distribution, showing that different age groups were mainly concentrated in summer and winter, while groups with different days of delayed onset were mainly concentrated in spring and winter. Summary statistics of all HFRS cases and environmental variables in Shenyang are shown in S1 Table.

Time-series analysis of HFRS% of Holt-Winters and SARIMA model in monthly and seasonal prediction
From the factor decomposition diagrams in S2 Fig, the annual incidence rate of HFRS is trending down, and after removing the trend effect from the original series, the difference in the average seasonal index among different quarters is the difference caused by the seasonal effect. HFRS in Shenyang peaked twice a year in the first and fourth quarter. The monthly HFRS has the characteristics of bimodal distribution, with the first peak in March-May, the second peak in November-December, and the seasonal inelasticity rising from April-August. The incidence of HFRS in Shenyang is characterized by a bimodal monthly distribution, with the first peak in March-May, the second peak in November-December, and an exponential decline in April-August, and a seasonal inverse rise starting in September. The model fixed-order plots in S3 Fig allow the SARIMA seasonal and monthly models to be parameterized for the ACF, PACF plots, combined with the "auto.arima()" function to correct for the AR and MA parameters (parameter estimates < 2 times the sample standard deviation). The model parameters and tests are shown in Table 3, it gives the forecasting accuracy of two models for the HFRS series. The SARIMA model has lower values for RMSE, MAE and MAPE, which means the SARIMA is more accurate.
After building a suitable model based on the model parameters, the incidence of HFRS from 2005-2018 was used as the training set and 2019 was used as the validation set to validate the model established by the training set. The Holt-Winters and SARIMA models were applied to predict the incidence of HFRS, respectively, and the prediction effects were plotted as obtained in Fig 2 and

Feature selection and fitting of environmental factors for HFRS
Spearman correlation analysis showed that HFRS was significantly correlated with humidity (r = -0.10, p<0.01), wind speed (r = 0.07, p<0.05) and SO 2 (r = 0.09, p<0.01) (S3 Table). Furthermore, to fit the BRT model, we set the model parameters: tree complexing was 5, learning rate was 0.005, and bag. fraction was 0.5. According to S6 Fig, the degree of fit of each of the three environmental factors fitting functions was like the Spearman correlation results, and the trend with the number of HFRS incidence was significant.
Depending on the distribution of observations in the environment space, the fitting function can give a distribution of fitted values relating to each predictor. The values at the top of each graph indicate the weighted average of the fitted values associated with each non-factor predictor. According to the interaction fitting function in Fig 3A and 3B, environmental factors fit better at moderate levels of interaction.

PLOS NEGLECTED TROPICAL DISEASES
Prediction, impacts and interaction of meteorological and pollution variables for HFRS

Exposure-response relationships for environmental factors with different lag times
As shown in Fig 4, the effects of humidity and SO 2 on HFRS differed across lag times and stratification factors when the study lag time points were 0 and 16 days. The RR of the effect of humidity on HFRS cases tended to increase at lag 0 and 16 days, with humidity RR values reaching a maximum at 20-40% and above 90% at lag 0, while lag 16 days only showed a maximum RR value at high humidity. The effect values of wind speed and SO 2 on HFRS cases at lags 0 and 16 days showed a trend of increasing and then decreasing RR, with RR values in the range of 2-4m/s and 200-250μg/m 3 , respectively. Within the different grouping intervals, the trend of RR effect values for humidity within Delay 0-4 days was slightly different from the overall, and the RR effect values for wind speed within Female, Delay 0-4, Delay 5-9 days, and SO 2 within Delay 0-4 days were significantly different.

Effects of extreme environmental variables on HFRS cases
To determine the effect of extreme environmental factors on the HFRS, the estimated effects were examined by comparing the 25th or 75th percentile of relative humidity, wind speed and  Table 4 shows the cumulative impact of the lag factor extremes on the HFRS at 16 lag days. We found that extreme high levels of SO 2 were positively linked to the onset of HFRS, while extreme low levels of SO 2 , with no wind effect, had a protective effect, and the RR values of cumulative effects were 2.583 (1.145,5.827) for high SO 2 effect and 0.577 (0.370, 0.898) for cold SO 2 effect. At 16 lag days, significant cumulative effects of low windy conditions were observed in males (RR value: 0.490 (0.241,0.997)), in the over 50 years age group (RR value: 0.335 (0.113,0.992)) and in delayed onset for over 10 days (RR value: 0.324 (0.106,0.983)). In turn, women at extreme SO 2 levels and patients with a delayed onset of 5-9 days are susceptible, with their RR values: 8 The distributed lagged effects of extreme environmental factors at various lag days for all groups were showed in Fig 5. We found that the dry effect indicated a maximum RR value on the current day, peaking at 4 lag days and then showed a U-shaped curve along the lag days, and the RR value subsequently decreased for the next days and then turned to rise along the lag days, whereas wet effect showed the opposite trend. The curve of dry and wet effect was roughly similar among different stratified groups. On the low windy effect, the 2 lag days is a peak followed by a decline, while the opposite is true for the high windy effect. The overall patients reached their highest effect at extreme high or low levels of SO 2 , usually at a lag of 1 day, followed by a gradual downward trend. female, aged >1 years and delayed 5-9 days remain the most sensitive people.

Environmental interaction during humidity, wind speed, SO 2 and HFRS cases
GAMs were built to show the interaction effect among humidity, wind speed and SO 2 on HFRS incidence (Fig 6). The program on the top side of Fig 6A and 6B shows the interaction effect of wind speed and humidity on HFRS. The HFRS infection risk increased as daily wind speed and humidity decreased. The plot to the bottom of Fig 6C and 6D indicates the interaction effect of SO 2 and humidity, HFRS tends to occur in higher SO 2 and lower humidity environmental conditions.

Discussion
From the time series and seasonal decomposition of the incidence of HFRS in Shenyang, combined with studies in various regions of China [37], it can be seen that there is a clear seasonal

PLOS NEGLECTED TROPICAL DISEASES
Prediction, impacts and interaction of meteorological and pollution variables for HFRS trend in the incidence of HFRS in China, with a decreasing trend year upon year. The peak in case reporting differs between single and double peaks across regions, with the main peak occurring between March and May each year. This study also shows a second peak in November to December, which may occur for the following reasons: (1) The peak of population movement is about March each year, when the mobility of urban life becomes more complex and the fast pace of life and consumption leads to a gradual decline in the demand for health.
(2) After November, the cooler temperatures in the city lead to larger crowds and a greater temperature difference between indoors and outdoors, allowing host animals to enter human life more closely and the disease to be more contagious.
Currently, many scholars have conducted research on predictive models for the onset of infectious diseases. The SARIMA model had been used to predict the incidence of HFRS [13,38]. It shows that SARIMA has the characteristics of being unconstrained by data type and high applicability, integrating factors such as trend, periodicity and random error, so that it can be used in prediction studies of infectious diseases with periodic morbidity characteristics. Pritthijit Nathet al. applied both the SARIMA model and the Holt-winters seasonal model for the prediction of airborne particulate matter in eastern India, the Holt-winters model was considered to be simple in principle and had a high predictive accuracy for diseases with a cyclical pattern of onset [39]. In this study we discussed the effect of the SARIMA model applied to the HFRS series and compared it with the Holt-Winters model. In terms of model mechanism, the SARIMA model is suitable for predicting series that are smooth and stable over time compared to the Holt-winters model, which is suitable for predicting models with a single trend of change. However, research on time series has limitations, as both methods in this study are extrapolated forecasts based on historical data, usually considering the characteristics of the series itself, and cannot predict sudden changes in the data due to changes in external factors. Moreover, the occurrence and prevalence of infectious diseases are influenced by multiple natural factors, climate and other social factors. Our study thus explored the lagging, interactive and stratified effects of meteorological and pollutant factors on the prevalence of HFRS in Shenyang.
Firstly, we need to select and fit the features using the spearman method combined with a BRT. The results were chosen from three environmental factors, wind speed, humidity and SO 2 , in order to achieve a precise study of the influence of environmental factors on HFRS. Subsequently, the DLNM method was applied to examine the exposure-lag-response relationship between the average daily cases of HFRS disease and environmental factors in Shenyang from 2014-2019. The results showed a non-linear lagged relationship among meteorological, pollutant factors and HFRS. Extremely high concentration levels of SO 2 increased the risk of

PLOS NEGLECTED TROPICAL DISEASES
Prediction, impacts and interaction of meteorological and pollution variables for HFRS HFRS, while low windy and low concentration levels of SO 2 were protective against HFRS from 0-16d. It was also found that the lagged effects of different climatic and pollutant factors were not identical. The different delay periods reflect the fact that the lagged effect of each environmental variable may be related to the spread of infection influenced by various factors, including the growth of the virus in the external environment, the inclination of people to go outside, and seasonal changes in rodent populations [21,40]. We demonstrated that high concentrations of SO 2 significantly influenced the spread of HFRS after 0-5 and 15 lag days.

PLOS NEGLECTED TROPICAL DISEASES
Prediction, impacts and interaction of meteorological and pollution variables for HFRS Extreme low windy were strongly associated with HFRS from lag 0 to a maximum lag 16 days, suggesting that the incidence of HFRS may be lagged by approximately 16 days at low windy. It could be similar to the study by Zhang [41] et al. on HFMD, mainly because low windy may inhibit the spread of hantavirus-containing particles [42]. The effect of humidity on HFRS cases tended to increase at days 0 and 16 when the lag time was 16 days, with the greatest effect at 20-40% at lag 0 days and at over 90% at lag 16 days. Higher humidity levels may indicate that humidity affects the survival of the rodent host, in addition to affecting the infection and stability of the virus in the in vitro environment [21]. Furthermore, in contrast to previous studies, our study included pollutants in the HFRS influencing factors and explored the non-linear lag between SO 2 and HFRS. Our study found that elevated SO 2 concentrations increased the risk of HFRS infection at levels in the range of 200-250 μg/m 3 and were significant in women and in patients with a delayed onset of 5-9 days. There is still controversy about the effect of SO 2 on HFRS, which may be related to regional, population differences and the proportion of pollutants in the air. However, regarding the effect of SO 2 on other infectious diseases, there were different reports showing a significant protective effect of SO 2 against influenza [43](RR = 0.892, 95% CI: 0.840-0.948), which probably due to the higher outdoor pollutants, resulting in a population more dependent on the indoor environment and less exposed to the virus. Stratified analysis showed that the effects of meteorological and pollutant factors varied by sex, age group, and number of days delayed onset. Men were more sensitive to extreme low windy than women, and women were more sensitive to extreme SO 2 concentration levels than men. Similar results have been reported in several studies on other infectious diseases [43]. Patients over 50 years were more significantly affected by extreme low windy and showed a protective effect, as compared to other age groups. But based on previous studies, wind speed effects were generally significantly associated with lower age [44]. This study showed that HFRS mainly affected people under 50 years of age at low windy, which might be attributed to underlying factors such as social factors, population distribution, etc. In terms of delay days, patients with a delay of 5-9 days were more sensitive to extreme SO 2 concentrations and patients with a delay of over 10 days were more susceptible to extreme low wind speed. The delay between the onset of HFRS and the time of diagnosis led to a lagging effect of environmental factors reinforced by the length of time the patient spent in the environment after onset. Additionally, we did not take temperature, barometric pressure into account in our study of the correlation between environmental factors and HFRS. Some studies had shown that mean and extreme temperatures were negatively correlated with cases of HFRS [20]. This study did not discuss the relationship with HFRS cases in terms of temperature, barometric pressure, and rainfall factors, showing different findings of HFRS in Shenyang before 2011 and after 2014, suggesting possible spatial and temporal variability.
The results of the interaction analysis showed that higher SO 2 and lower humidity environments were the dangerous environmental conditions for the occurrence of HFRS. It was demonstrated that NO X and SO 2 in the air showed strong seasonal variations and that their concentrations were closely related to meteorological factors such as wind speed, temperature and relative humidity. Air pollution may impact the frequency of HFRS cases by modifying viral infectivity and immunity in humans and rodents [45,46].The combined effect of low windy and low humidity also affected the development of HFRS disease. Analysis of the geographical distribution of the country suggested that this result could be attributed to the region's location in a climatic zone [47].
Our research benefits cover: (1) the study period is long, and the study collected case and environmental data over many years. (2) For the time series analysis, we applied two different models for comparison and also split the data into monthly and seasonal data for accurate comparison and forecasting. (3) Our study applies advanced statistical methods, not only applying spearman to feature selection, but also applying the BRT method to fit the screened variables and their interactions, followed by DLNM and GAM to analyze the lagged, extreme and cumulative effects of environmental factors. Our findings can provide evidence and guidance on the lagged effects and interactions of environmental factors on HFRS. It is worth pointing out that there were some limitations to our study. Firstly, there are cases of HFRS in this study that have been diagnosed both clinically and through the laboratory, which may be subject to diagnostic bias and are under-reported. Moreover, due to the regional limitations of this study, other regions should be referred to with caution in studying the impact and prediction of HFRS disease, considering regional characteristics, and making changes in model selection, parameters and factor selection.